Problems with shadow mapping?

• Acne 

• Peter-panning 

• Aliasing 


Endless tuning to alleviate the above...
Ray tracing to the rescue...

Traditional Bounding Volume Hierarchy

• Can skip many ray-triangle hit tests 

• Need to rebuild hierarchy on the GPU 

• For dynamic objects 

• Tree traversal is inherently slow 
Storing Primitives for Ray-Tracing

Without building a bounding volume hierarchy! 

• For shadow maps, store depth from light

• Simple and coherent lookups

• Store primitives similarly

• A "Deep Primitive Map"

• Store an array of front facing triangles per texel

Figure: nearest triangle normal
Deep Primitive Map Rendering #1

• This consists of 3 resources:

• Prim Count Map (N x N) - how many triangles in this texel, use an atomic to count intersecting triangles

• Prim Indices Map (N x N x d) - index of triangle in prim buffer

• Prim Buffer - post transformed triangles

• Tune N & d per model

Figure: N x N x d deep primitive map, made up of the Prim Count Map, Prim Indices Map and Prim Buffer
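As a rough sketch, the three resources could be declared along these lines. The names g_IndexCounterMap, g_IndexMap and g_PrimBuffer come from the pixel shader shown later in this deck; the struct name Prim, the register slots and the choice of RWTexture2DArray for the indices map are assumptions made for illustration.

```hlsl
// One post-transformed triangle: three world-space vertices (36 bytes).
// The struct name "Prim" is hypothetical.
struct Prim
{
    float3 f3PositionWS0;
    float3 f3PositionWS1;
    float3 f3PositionWS2;
};

// Prim Count Map (N x N): number of triangles touching each texel.
RWTexture2D<uint>        g_IndexCounterMap : register( u1 ); // slot is an assumption

// Prim Indices Map (N x N x d): up to d prim buffer indices per texel.
RWTexture2DArray<uint>   g_IndexMap        : register( u2 ); // slot is an assumption

// Prim Buffer: world-space triangles, addressed by prim index.
RWStructuredBuffer<Prim> g_PrimBuffer      : register( u3 ); // slot is an assumption
```

The UAV slots start at u1 here because a pixel shader that also writes a render target shares its slot range with the bound render target views.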
Deep Primitive Map Rendering #2

• Is d large enough?

• Visualize occupancy

• Black: Empty

• White: Full

• Red: Limit exceeded

• Easy to get this right for a known model
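One possible way to produce such an occupancy view is a small debug helper like the one below. The function name and parameters are hypothetical; it only assumes that the per-texel counter keeps incrementing past d, which is what an unconditional atomic add gives you.

```hlsl
// Hypothetical debug helper: maps a texel's triangle count to a color.
//   uCount: value read from the Prim Count Map
//   uDepth: d, the per-texel capacity of the Prim Indices Map
float3 Occupancy_Color( uint uCount, uint uDepth )
{
    // More triangles touched this texel than the indices map can hold
    if ( uCount > uDepth )
    {
        return float3( 1.0f, 0.0f, 0.0f ); // Red: limit exceeded
    }

    // Black (empty) through grey to white (full)
    float fFill = (float)uCount / (float)max( uDepth, 1u );
    return float3( fFill, fFill, fFill );
}
```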
Deep Primitive Map Rendering #3

• GS outputs 3 vertices & SV_PrimitiveID to PS

[maxvertexcount(3)]
void Primitive_Map_GS( triangle GS_Input IN[3], uint uPrimID : SV_PrimitiveID, inout TriangleStream<PS_Input> Triangles )
{
    PS_Input O;

    [unroll]
    for ( int i = 0; i < 3; ++i )
    {
        O.f3PositionWS0 = IN[0].f3PositionWS; // 3 WS vertices of primitive
        O.f3PositionWS1 = IN[1].f3PositionWS;
        O.f3PositionWS2 = IN[2].f3PositionWS;
        O.f4PositionCS  = IN[i].f4PositionCS; // SV_Position
        O.uPrimID       = uPrimID;            // SV_PrimitiveID
        Triangles.Append( O );
    }

    Triangles.RestartStrip();
}

Tip: Use DX11.1 to output WS vertices directly to UAV
Deep Primitive Map Rendering #4

• PS hashes draw call ID (shader constant) with SV_PrimitiveID to produce prim index/address

float Primitive_Map_PS( PS_Input IN ) : SV_TARGET
{
    // Hash draw call ID with primitive ID
    uint PrimIndex = g_DrawCallOffset + IN.uPrimID;

    // Write out the WS positions to prim buffer
    g_PrimBuffer[PrimIndex].f3PositionWS0 = IN.f3PositionWS0;
    g_PrimBuffer[PrimIndex].f3PositionWS1 = IN.f3PositionWS1;
    g_PrimBuffer[PrimIndex].f3PositionWS2 = IN.f3PositionWS2;

    // Increment current primitive counter
    uint CurrentIndexCounter;
    InterlockedAdd( g_IndexCounterMap[uint2( IN.f4PositionCS.xy )], 1, CurrentIndexCounter );

    // Write out the primitive index
    g_IndexMap[uint3( IN.f4PositionCS.xy, CurrentIndexCounter )] = PrimIndex;

    return 0;
}
Deep Primitive Map Rendering #5

• Conservative raster is needed to capture all prims touching a texel

• Can be done in SW or HW...
HW Conservative Raster #1

• Rasterize every pixel touched by a triangle

• Enabled in DirectX 12 & 11.3

Figure: two rasterization examples, each shown with conservative raster Off and On
HW Conservative Raster #2

• 3 Tiers of functionality

• See DirectX documentation

• Be aware that a tier 1 CR can cull degenerate triangles post sub-pixel snapping

• Solution: Ensure you snap triangles in a consistent way for ray tracing
SW Conservative Raster

• Use the GS to dilate a triangle in clip space 

• Generate AABB to clip the triangle in the PS 

• See GPU Gems 2 - Chapter 42 
Ray-Tracing #1

• For each screen pixel:

• Calc prim map coords (as for shadow mapping)

• Iterate over prim index array

• For each index fetch a triangle for ray testing
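The coordinate calculation mirrors a shadow map lookup. A minimal sketch follows, assuming a row-vector mul() convention and two hypothetical constants (g_f4x4LightViewProj and g_fPrimMapSize) that do not appear in the slides.

```hlsl
// Hypothetical constants, named here for illustration only.
cbuffer cbLight
{
    float4x4 g_f4x4LightViewProj; // world space -> light clip space
    float    g_fPrimMapSize;      // N, the primitive map resolution
};

// Returns texel coordinates in the primitive map for a world-space position.
float2 Calc_Prim_Map_Coord( float3 f3PositionWS )
{
    // Project into the light's clip space, then to NDC
    float4 f4PositionLS = mul( float4( f3PositionWS, 1.0f ), g_f4x4LightViewProj );
    f4PositionLS.xyz /= f4PositionLS.w;

    // NDC [-1, 1] -> UV [0, 1], with y flipped for D3D texture space
    float2 f2UV = f4PositionLS.xy * float2( 0.5f, -0.5f ) + 0.5f;

    // UV -> texel coordinates, the form a Load() call expects
    return f2UV * g_fPrimMapSize;
}
```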
Ray-Tracing #2

float Ray_Test( float2 MapCoord, float3 f3Origin, float3 f3Dir, out float BlockerDistance )
{
    // Initialize the out parameter so it is defined on every return path
    BlockerDistance = 0.0f;

    uint uCounter = tIndexCounterMap.Load( int3( MapCoord, 0 ), int2( 0, 0 ) ).x;

    [branch]
    if ( uCounter > 0 )
    {
        for ( uint i = 0; i < uCounter; i++ )
        {
            uint uPrimIndex = tIndexMap.Load( int4( MapCoord, i, 0 ), int2( 0, 0 ) ).x;

            float3 v0, v1, v2;
            Load_Prim( uPrimIndex, v0, v1, v2 );

            // See "Fast, Minimum Storage Ray / Triangle Intersection"
            // by Tomas Möller & Ben Trumbore
            [branch]
            if ( Ray_Hit_Triangle( f3Origin, f3Dir, v0, v1, v2, BlockerDistance ) != 0.0f )
            {
                return 1.0f;
            }
        }
    }

    return 0.0f;
}
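The slides do not show Ray_Hit_Triangle itself. A minimal sketch of the cited Möller-Trumbore test, matching the call signature used above, might look like this; the epsilon value and the two-sided (no backface culling) behavior are assumptions.

```hlsl
// Returns 1.0f on a hit and writes the hit distance along the ray, else 0.0f.
// fDistance is in units of |f3Dir|, so pass a normalized direction to get
// a world-space distance.
float Ray_Hit_Triangle( float3 f3Origin, float3 f3Dir,
                        float3 v0, float3 v1, float3 v2,
                        out float fDistance )
{
    fDistance = 0.0f;

    float3 f3Edge1 = v1 - v0;
    float3 f3Edge2 = v2 - v0;

    float3 f3P  = cross( f3Dir, f3Edge2 );
    float  fDet = dot( f3Edge1, f3P );

    // Ray is (nearly) parallel to the triangle plane; epsilon is an assumption
    if ( abs( fDet ) < 1e-6f ) return 0.0f;
    float fInvDet = 1.0f / fDet;

    // First barycentric coordinate
    float3 f3T = f3Origin - v0;
    float  fU  = dot( f3T, f3P ) * fInvDet;
    if ( fU < 0.0f || fU > 1.0f ) return 0.0f;

    // Second barycentric coordinate
    float3 f3Q = cross( f3T, f3Edge1 );
    float  fV  = dot( f3Dir, f3Q ) * fInvDet;
    if ( fV < 0.0f || fU + fV > 1.0f ) return 0.0f;

    // Distance along the ray; reject hits behind the origin
    float fT = dot( f3Edge2, f3Q ) * fInvDet;
    if ( fT <= 0.0f ) return 0.0f;

    fDistance = fT;
    return 1.0f;
}
```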














Figure: Ray Traced

SM = 3K x 3K (36 MB)

PM = 1K x 1K x 64 (256 MB)
Anti-Aliasing

• Shoot additional rays to achieve this?

• This is very expensive!

• Simple trick - apply a screen space AA technique (FXAA, MLAA, etc.)


If you're not cheating, you're just not trying 
Are hard shadows that useful in games?



GAME DEVELOPERS CONFERENCE® 2015 


MARCH 2-6, 2015 GDCONF.COM 


Hybrid Approach 

• Combine ray-traced shadow with conventional soft shadows

• Use an advanced filtering technique such as CHS or PCSS

• Use blocker distance to compute a lerp factor

• As blocker distance -> 0, ray-traced result is prevalent



Lerp Factor Visualization

L = saturate( BD / WSS * PHS )

L: Lerp factor
BD: Blocker distance (from ray origin)
WSS: World space scale - chosen based upon model
PHS: Desired percentage of hard shadow

FS = lerp( RTS, PCSS, L )

FS: Final shadow result
RTS: Ray traced shadow result (0 or 1)
PCSS: PCSS+ shadow result (0 to 1)

Figure: lerp factor, annotated where blocker distance = 0
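Put together, the two expressions above amount to a few lines of shader code. The sketch below uses a hypothetical function name and parameter names; the lerp factor is evaluated left to right exactly as the expression is written on the slide, so adjust the grouping if BD / (WSS * PHS) is what you intend.

```hlsl
// fRTS:  ray traced shadow result (0 or 1)
// fPCSS: filtered soft shadow result (0 to 1)
// fBD:   blocker distance from the ray origin
// fWSS:  world space scale, chosen per model
// fPHS:  desired percentage of hard shadow
float Hybrid_Shadow( float fRTS, float fPCSS, float fBD, float fWSS, float fPHS )
{
    // Lerp factor, evaluated left to right as written on the slide
    float fL = saturate( fBD / fWSS * fPHS );

    // fL -> 0 close to the blocker (hard, ray traced result),
    // fL -> 1 far from the blocker (soft, filtered result)
    return lerp( fRTS, fPCSS, fL );
}
```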
Use a Shrinking Penumbra Filter

• Otherwise the soft shadow result will not be fully contained by the ray traced result

• This would cause problems when performing a lerp between the two






Figure: Hybrid Ray Traced + Standard Filter vs. Hybrid Ray Traced + Shrinking Penumbra Filter

Figure: PCSS (SM = 3K x 3K, 36 MB) vs. Hybrid Ray Traced (SM = 3K x 3K, 36 MB; PM = 1K x 1K x 64, 256 MB)
Prims: ~10K

Shadow Map: 3K x 3K (36 MB)
Primitive Map: 1K x 1K x 32 (128 MB)
Primitive Buffer: ~360 KB
Shadow Buffer: 1920 x 1080
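The quoted sizes can be reproduced with simple arithmetic, assuming 4 bytes per shadow map texel and per primitive index, 3K = 3072, 1K = 1024, and three float3 vertices (36 bytes) per primitive. These element sizes are assumptions; the slides only state the totals.

```latex
\begin{align*}
\text{Shadow Map} &: 3072 \times 3072 \times 4\,\text{B} = 37{,}748{,}736\,\text{B} = 36\,\text{MB} \\
\text{Primitive Map} &: 1024 \times 1024 \times 32 \times 4\,\text{B} = 134{,}217{,}728\,\text{B} = 128\,\text{MB} \\
\text{Primitive Buffer} &: 10{,}000 \times 3 \times 12\,\text{B} = 360{,}000\,\text{B} \approx 360\,\text{KB}
\end{align*}
```

Doubling d from 32 to 64 doubles the primitive map to 256 MB, which is why d is worth tuning per model.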




                          AMD R9 290X    NV GTX 980
Primitive Map + HW CR         -             0.4
Primitive Map + SW CR         0.6           0.5
Ray Trace                     0.5           0.4
PCSS                          1.4           1.3
PCSS + Ray Trace              1.9           1.8
FXAA                          0.3           0.2

Quoted times in milliseconds
Prims: ~65K

Shadow Map: 3K x 3K (36 MB)
Primitive Map: 1K x 1K x 64 (256 MB)
Primitive Buffer: ~2.2 MB
Shadow Buffer: 1920 x 1080




                          AMD R9 290X    NV GTX 980
Primitive Map + HW CR         -             0.5
Primitive Map + SW CR         1.4           0.7
Ray Trace                     1.0           0.7
PCSS                          1.4           1.3
PCSS + Ray Trace              3.0           2.8
FXAA                          0.3           0.2

Quoted times in milliseconds
Prims: ~240K

Shadow Map: 3K x 3K (36 MB)
Primitive Map: 1K x 1K x 64 (256 MB)
Primitive Buffer: ~8.2 MB
Shadow Buffer: 1920 x 1080




                          AMD R9 290X    NV GTX 980
Primitive Map + HW CR         -             3.4
Primitive Map + SW CR         -             4.1
Ray Trace                     1.2           1.0
PCSS                          1.4           1.3
PCSS + Ray Trace              3.7           3.4
FXAA                          0.3           0.2

Quoted times in milliseconds
Limitations

• Currently limited to a single light source

• This would not scale up to work for a whole scene

• Storage would become the limiter 

• But is ideal for closest models: 

• Current model of focus 

• Contents of nearest cascade 
Summary

• Addresses conventional shadow map problems

• AA ray-traced hard shadows are highly performant

• Hybrid shadows combine best of both worlds 

• No need to re-write your engine 

• Fast enough for games today! 
Questions?


ions@nvidia.com 



